Development on the PHIS Information System
As a study engineer in computer science at MISTEA, on the Gaillarde campus, I worked on a stimulating project between June 2017 and April 2018. The main challenge of this mission was integrating a workflow manager to optimize data processing pipelines in the PHIS information system.
The project was also meant to lead to a scientific article that would strengthen PHIS's visibility and credibility within the scientific community.
I had the privilege of collaborating with the PHIS development team and the management team of the LEPSE high-throughput phenotyping platform, which enriched my professional experience.
Tasks & Objectives
During this mission, I took on several roles, each bringing its own challenges and lessons. My responsibilities included ticket management and the ongoing maintenance and enhancement of the PHIS information system. I also compared the functionality of the available workflow managers in depth to identify the one that would integrate best with PHIS. One of my main objectives was to integrate that manager and develop the missing features needed for our article. The success criteria were clear: we had to publish the article, and we had to make an informed choice of a workflow manager that met the project's scientific needs.
Actions and Development
To achieve these objectives, I took a series of concrete actions. I first fixed bugs and handled incoming tickets, which let me become familiar with the existing system. I then analyzed four workflow managers (Galaxy, KNIME, Taverna, and gUse) and presented my findings to my colleagues in a webinar. Building the prototype that integrated Galaxy into PHIS was a crucial step, as it turned the comparison into a concrete solution. Working with the LEPSE team, particularly the platform director, helped me understand the challenges faced in the field and resolve information system issues effectively.
Results
The results were very satisfactory. I presented a webinar on workflow managers during a meeting of an INRAE CATI (Automated Information Processing Center), which allowed us to share our progress and exchange ideas with other professionals in the field. The publication of our article in New Phytologist, the journal of the New Phytologist Foundation, also marked an important milestone in our work (link: article here). In the long term, the PHIS system continued to evolve and became OpenSilex, which demonstrates the sustainability and impact of our efforts.
Beyond tangible results, this experience offered me valuable lessons. I gained a deep understanding of workflows, particularly the importance of clear vocabulary for structuring the different stages of data processing. I also realized that granularity in workflow design can influence system complexity. Indeed, a judicious choice regarding the level of detail of components can simplify or complicate operations. Thus, rather than directly integrating Galaxy workflows into PHIS, it would be wiser to use wrappers to efficiently manage inputs and outputs.
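To make the wrapper idea more concrete, here is a minimal Python sketch. It is not the code used in PHIS or Galaxy: the `StepWrapper` class, the `mean_leaf_area` function, and the input and output names are all hypothetical, chosen only to illustrate how a wrapper can check inputs and name outputs around a processing step while the step itself stays unaware of the host system.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class StepWrapper:
    """Hypothetical wrapper around a single processing step.

    It declares which named inputs the step needs and which names
    its outputs should receive, so the caller only exchanges
    dictionaries with it.
    """
    name: str
    required_inputs: List[str]
    output_names: List[str]
    run: Callable[..., tuple]

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Fail early, with a clear message, if an input is missing.
        missing = [key for key in self.required_inputs if key not in data]
        if missing:
            raise KeyError(f"{self.name}: missing inputs {missing}")

        # Pass the inputs to the wrapped step in the declared order.
        results = self.run(*(data[key] for key in self.required_inputs))

        # Make sure the step returned what the wrapper promised.
        if len(results) != len(self.output_names):
            raise ValueError(
                f"{self.name}: expected {len(self.output_names)} outputs, "
                f"got {len(results)}"
            )
        return dict(zip(self.output_names, results))


def mean_leaf_area(areas: List[float]) -> tuple:
    """Illustrative processing step (assumption, not a PHIS function)."""
    return (sum(areas) / len(areas),)


step = StepWrapper(
    name="mean_leaf_area",
    required_inputs=["leaf_areas"],
    output_names=["mean_area"],
    run=mean_leaf_area,
)
result = step({"leaf_areas": [10.0, 12.0, 14.0]})
print(result)  # {'mean_area': 12.0}
```

With this kind of design, the granularity question becomes a choice of how much work each wrapped step does: fewer, larger steps mean fewer interfaces to maintain, while smaller steps are easier to reuse but multiply the inputs and outputs to track.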
Technical Stack
The project relies on the following tools and technologies:
- Backend: PHP, Yii2
- Frontend: HTML, JavaScript
- Scripting languages: R, Python
- Databases: MongoDB, PostgreSQL
- Workflow manager: Galaxy
This technical stack was inherited from the existing PHIS system. The major technical challenges I encountered were:
- Complex inherited code, particularly a 600-line function with multiple levels of conditions and nested loops
- Integration of workflow managers in an existing scientific information system
This experience at MISTEA not only allowed me to apply my technical skills but also taught me the importance of collaboration and communication in the success of a scientific project.